Abstract

Many studies assume that the solar irradiance in the EUV can be decomposed into
different contributions, which makes the modelling of the spectral variability considerably
easier. We consider a different approach, in which these contributions are not imposed a
priori but are effectively and robustly inferred from spectral irradiance measurements. This
is a source separation problem with a positivity constraint, for which we use a Bayesian
solution. Using five years of daily EUV spectra recorded by the TIMED/SEE satellite,
we show that the spectral irradiance can be decomposed into three elementary spectra.
Our results suggest that they describe different layers of the solar atmosphere rather than
specific regions. The temporal variability of these spectra is discussed.

1 Motivation



The solar extreme ultraviolet (EUV) irradiance is the primary energy input for the
diurnal ionosphere and one of the key parameters for space weather. Any variation in the
incoming EUV flux at the top of Earth's atmosphere modifies the state of the
thermosphere/ionosphere system and can affect human activities such as radio telecommunication
and orbitography (through satellite drag). Knowledge of the spectral irradiance in real
time is essential for mitigating these potentially harmful effects.

The EUV flux, however, can only be measured from space. Several approaches have
been developed to overcome this dependence on space-borne instruments. Historically,
and driven in part by the lack of measurements, indices such as the radio flux at 10.7
cm (Richards et al. 1994; Tobiska et al. 2000) have been used as proxies for the solar
EUV flux. For a long time, these indices have provided ionospheric physicists with very
useful inputs for their models. These indices, however, also have intrinsic limits because
of the difference between the physical processes that give rise to them and to the EUV
emission. Furthermore, the observation of the solar disk in the EUV, notably through
the SoHO satellite, has revealed tremendous heterogeneity and dynamics in this spectral
range. It then comes as no surprise that the variability of the whole EUV spectrum can
hardly be reproduced with a single proxy (Floyd et al. 2005; Dudok de Wit et al. 2008).



A different approach consists in decomposing the solar spectral irradiance into the
sum of contributions that come from different regions, each of which has a characteristic
spectrum. For example, Vernazza & Reeves (1978), using SKYLAB data, and later
Curdt et al. (2001), using SOHO/SUMER data, empirically decomposed the Sun into
three regions (quiet Sun, coronal holes, and active regions) and associated a typical
spectrum with each of them. Similar ideas were put forward by Lean et al. (1982) and
Woods et al. (2000). This has led Warren et al. (2001) to model the solar EUV irradiance
as a linear combination of three spectra that are again associated with the quiet Sun,
coronal holes, and active regions. Kretzschmar et al. (2004) and Warren (2005) have
pursued these studies. Good agreement has been found with other models and measurements
(Woods et al. 2005). A similar strategy has been used for the near-UV and visible range,
where the solar surface has been decomposed into photospheric features such as sunspots
and faculae (Fontenla & Harder 2005; Wenzler et al. 2006).



All these studies, however, rely on the rather subjective choice of solar regions and on 
the assumption that these may be associated with characteristic (or elementary) spectra. 
This strong constraint has always been justified through empirical arguments. A first 
problem here is to determine the number of elementary spectra. A second problem is to 
define the solar regions and the resolution needed to resolve them. One may, for example, 
wonder how small the solar features should be to properly explain the variability of the 
whole disk. This is a kind of endless problem, since the better the resolution, the finer the 
structure and the stronger the dynamics. 

To bypass these problems, we follow a novel approach. Instead of starting from a pre- 
defined set of solar regions that are guessed from empirical knowledge, we use a statistical 
method to determine if the solar EUV spectrum can be decomposed at all, and to extract 
its different components. A major difference with respect to previous approaches is that
the elementary spectra are identified only from the statistical properties of the solar
spectral dynamics, without any a priori assumption on their number or on their shape.
In this sense, the method is less biased and we are more likely to discover new
and unsuspected aspects of the solar variability in the EUV.

The method we use is based on a recent and powerful mathematical concept called
Bayesian positive source separation (BPSS), which allows us here to decompose the solar EUV
spectral variability into a linear superposition of contributions. The aim of this
letter is twofold. First, we show that three elementary spectra are sufficient for
reproducing the salient features of the EUV spectral variability. Second, we show that these
elementary spectra, which are determined by statistical means alone, actually have a
physical interpretation. Our results suggest that they describe different volumes of the solar
atmosphere rather than specific regions.



2 Positive source separation and its application to the EUV 
solar spectrum analysis 

We consider the five years of daily solar spectral irradiance measurements made since Feb. 2002
by the Solar EUV Experiment (SEE) onboard TIMED (Woods et al. 2005). The EUV
Grating Spectrograph (EGS), which is part of SEE, measures the spectrum from 25 to
195 nm with a 0.4 nm spectral resolution. We use level 2 data (version 9), in which the
spectral irradiance is provided from 25 to 195 nm with a 0.1 nm spectral step. Solar
flares are excluded from this data set. Some wavelengths are missing around the strong
H I Lyman-α line for instrumental reasons. The signal-to-noise ratio gradually decreases
in time because of instrument degradation and the declining solar cycle. Our results,
however, remain unchanged when the analysis is performed only on the first half of the
data set.

The spectral irradiance data from TIMED/SEE are stored in a matrix $I(t,\lambda)$ of size
$(n_t = 2146, n_\lambda = 1546)$, where $t$ denotes time and $\lambda$ denotes wavelength.
Our objective is to decompose each spectrum as a linear combination of $n_e$ elementary
spectra or, equivalently, to decompose the matrix $I(t,\lambda)$ as a product of two matrices
$V(t)$ and $S(\lambda)$, of respective sizes $(n_t, n_e)$ and $(n_e, n_\lambda)$. Each row of
the matrix $S(\lambda)$ contains an elementary spectrum, and its associated time variability
is stored in the corresponding column of the matrix $V(t)$. These matrices are positive, in
the sense that all their entries are positive. This problem is known as positive matrix
factorization (Paatero & Tapper 1994) or positive source separation (Moussaoui et al. 2006).
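The factorization described above can be illustrated with a short sketch. The code below is not the Bayesian algorithm used in this letter: it swaps in the classical multiplicative-update rule for nonnegative matrix factorization, only to show the shapes of $I$, $V$, and $S$ and the role of the positivity constraint. The function name, the toy sizes, and the synthetic data are all assumptions made for the example.

```python
import numpy as np

def nmf_multiplicative(I, n_e, n_iter=500, eps=1e-12, seed=0):
    """Factor a nonnegative matrix I (n_t x n_lambda) as V @ S with V, S >= 0.

    NOTE: multiplicative-update NMF, used here as a simple stand-in.
    It is NOT the Bayesian positive source separation (BPSS) algorithm.
    """
    rng = np.random.default_rng(seed)
    n_t, n_lambda = I.shape
    V = rng.random((n_t, n_e)) + eps        # time variabilities, one column per source
    S = rng.random((n_e, n_lambda)) + eps   # elementary spectra, one row per source
    for _ in range(n_iter):
        # Each update multiplies by a nonnegative ratio, so positivity is preserved.
        S *= (V.T @ I) / (V.T @ V @ S + eps)
        V *= (I @ S.T) / (V @ S @ S.T + eps)
    return V, S

# Synthetic toy data (hypothetical sizes, far smaller than 2146 x 1546).
rng = np.random.default_rng(1)
V_true = rng.random((40, 3))
S_true = rng.random((3, 60))
I_toy = V_true @ S_true

V_est, S_est = nmf_multiplicative(I_toy, n_e=3)
rel_err = np.linalg.norm(I_toy - V_est @ S_est) / np.linalg.norm(I_toy)
```

Such a factorization is only defined up to a scaling and a permutation of the sources, so `V_est` and `S_est` need not match `V_true` and `S_true` entry by entry even when the product is well reproduced.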



The approach to solve this factorization problem is based on Bayesian estimation
theory (Gelman et al. 2003). We improve the data model by including an additive noise
term $B(t,\lambda)$, which corresponds to measurement noise and to data modelling errors. In the
following, we assume that the entries $B(t,\lambda)$ are independent, zero-mean Gaussian random
variables. The positive source separation problem then consists in finding the matrices
$V(t)$ and $S(\lambda)$ from the knowledge only of the data $I(t,\lambda)$ and under the assumption
of the modelling equation $I(t,\lambda) = V(t) \times S(\lambda) + B(t,\lambda)$. According to the
Bayesian paradigm (Gelman et al. 2003), we assume that $V(t)$ and $S(\lambda)$ are random matrices,
and the assessment of these matrices is to be understood in a probabilistic sense. In other
words, the problem is solved if we know the joint probability distribution of $V(t)$ and
$S(\lambda)$, given the data $I(t,\lambda)$, called the a posteriori distribution. According to Bayes'
theorem, the a posteriori distribution reads (omitting $t$ and $\lambda$ for the sake of clarity)

$$P(S, V \mid I) = P(I \mid S, V) \times P(S, V) / P(I).$$



In this equation, $P(I \mid S, V)$ is called the likelihood function and is known as soon as the
observation equation or modelling equation is known. Here, since $I = V \times S + B$, the
likelihood function is simply given by the distribution of the noise term with mean $V \times S$,
or $P_B(I - V \times S)$. The second term, $P(S, V)$, is called the a priori distribution of the
parameters we are looking for. This distribution has to be chosen carefully, according to
the a priori knowledge we have on the variabilities $V$ and the elementary spectra $S$. We
assume here that these spectra and the variabilities are statistically independent random
matrices, so that their joint distribution factorizes into $P(S) \times P(V)$. Furthermore, since
we are looking for positive quantities, we impose that the distributions are zero for negative
values of any of their arguments. Typically, we assume that the entries of the matrices are
independent random variables, and identically distributed according to Gamma probability
density functions.
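As a rough illustration of how these ingredients combine, the following sketch evaluates the logarithm of the a posteriori distribution up to an additive constant. The noise standard deviation `sigma` and the Gamma shape and rate parameters `a` and `b` are hypothetical values chosen for the example; the source does not specify them, and the function name is likewise an assumption.

```python
import numpy as np

def log_posterior(I, V, S, sigma, a=2.0, b=1.0):
    """Unnormalised log of P(S, V | I) for the model I = V @ S + B.

    Assumptions (illustrative only): B has i.i.d. N(0, sigma^2) entries, and
    every entry of V and S has an independent Gamma(shape=a, rate=b) prior.
    """
    if np.any(V <= 0) or np.any(S <= 0):
        return -np.inf                      # the prior support is restricted to positive entries
    resid = I - V @ S
    # Gaussian log-likelihood P_B(I - V S), dropping terms that do not depend on V or S.
    log_lik = -0.5 * np.sum(resid**2) / sigma**2
    # Gamma log-density up to a constant: (a - 1) * log(x) - b * x, summed over all entries.
    log_prior = sum(np.sum((a - 1.0) * np.log(X) - b * X) for X in (V, S))
    return log_lik + log_prior

# Tiny 1 x 1 example in which the model fits the datum exactly.
value = log_posterior(np.array([[2.0]]), np.array([[1.0]]), np.array([[2.0]]), sigma=1.0)
```

The normalising term $P(I)$ is omitted because it does not depend on $S$ or $V$ and is not needed by sampling methods that only compare ratios of posterior values.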

All the assumptions described above allow us to write the a posteriori distribution
$P(S, V \mid I)$, and knowing this distribution means that all the information contained in the
data $I$ about the parameters $S$ and $V$ is known. However, a pragmatic point of view
imposes point estimates of the matrices $V(t)$ and $S(\lambda)$. Such estimates are obtained from
the a posteriori distribution by using so-called Bayesian estimators, among which the
most famous are the minimum mean square error estimator (MMSE) and the maximum
a posteriori estimator (MAP). The former can be shown to be the a posteriori mean, i.e.
the mean of the a posteriori distribution, and the latter is given by the parameters
that maximize the a posteriori distribution. Here, we choose the MMSE estimator, e.g.

$$\hat{V} = \int V \, P(V \mid I) \, dV.$$



In practice, the a posteriori distribution lies in a high dimensional space and is
mathematically so complex that the Bayesian estimators cannot be evaluated theoretically.
Numerical approximations are needed, and for several reasons exposed in (Moussaoui et al.
2006), a Markov chain Monte-Carlo algorithm is used (Gelman et al. 2003). Such an
algorithm provides a multidimensional Markov chain $M(n)$ such that the distribution of
$M(n)$ at iteration $n$ is close to the a posteriori distribution $P(S, V \mid I)$. The design of the
chain ensures that these distributions coincide asymptotically ($n \to +\infty$). Furthermore, if
the chain is correctly designed, the coincidence can be obtained rapidly
($n_{\mathrm{coincidence}} \sim 10^{[\ldots]}$ iterations). The outputs obtained after the coincidence
are then used to perform a sample mean. For example, if $M(n) = (V(n), S(n))$, we obtain an
approximate MMSE estimator of the variabilities via

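$$\hat{V} \approx \frac{1}{N} \sum_{n = n_{\mathrm{coincidence}}+1}^{n_{\mathrm{coincidence}}+N} V(n),$$

where $N$ is the number of iterations retained after the coincidence.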

This algorithm is explained in Moussaoui et al. (2006), and its source code is available on
request.
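The idea of averaging the chain outputs after coincidence can be sketched on a toy problem. The example below is not the sampler of Moussaoui et al. (2006): it uses a plain random-walk Metropolis chain on a single positive scalar, with a Gamma(3, 1) target chosen only because its mean is known. The function names, the step size, the chain length, and the number of discarded iterations are all assumptions.

```python
import numpy as np

def metropolis_positive(log_target, x0, n_iter, step=1.0, seed=0):
    """Random-walk Metropolis chain for a scalar restricted to x > 0."""
    rng = np.random.default_rng(seed)
    chain = np.empty(n_iter)
    x, lp = x0, log_target(x0)
    for n in range(n_iter):
        prop = x + step * rng.normal()
        # Proposals outside the positive domain have zero density and are always rejected.
        lp_prop = log_target(prop) if prop > 0 else -np.inf
        if np.log(rng.random()) < lp_prop - lp:   # accept/reject step
            x, lp = prop, lp_prop
        chain[n] = x
    return chain

def mmse_from_chain(chain, n_coincidence):
    """Sample mean of the chain outputs obtained after iteration n_coincidence."""
    return np.asarray(chain, dtype=float)[n_coincidence:].mean(axis=0)

# Toy target: Gamma(shape=3, rate=1) density, whose exact mean is 3.
log_gamma31 = lambda x: 2.0 * np.log(x) - x
chain = metropolis_positive(log_gamma31, x0=10.0, n_iter=20000)
v_hat = mmse_from_chain(chain, n_coincidence=2000)
```

Starting the chain far from the bulk of the target (here at 10) makes the role of the discarded initial iterations visible: including them would bias the sample mean toward the starting point.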



3 Solar EUV irradiance decomposition results 

The first key question is how many elementary spectra are needed to properly
reproduce the solar spectral variability. This can be answered in two different ways. First
we decompose the spectral irradiance into $n_e$ elementary spectra, then compute the
normalised difference between the measured and the reconstructed spectra,
$e(t,\lambda) = \left( I(t,\lambda) - \sum_{i=1}^{n_e} V_i(t) S_i(\lambda) \right) / I(t,\lambda)$,
and subsequently consider the normalised mean square error
$J = \langle e^2(t,\lambda) \rangle_{t,\lambda}$, averaged over time and all wavelengths. For
reconstructions with respectively $n_e = 1$ up to $n_e = 5$ elementary spectra, the normalised
mean square error equals 3.5%, 0.36%, 0.21%, 0.18%, and 0.13%. With one single spectrum, the
decomposition is trivial. Two spectra clearly improve the quality of the fit. Some improvement
is still noticeable with $n_e = 3$ spectra, but the error then levels off because the model
starts fitting noise. Thus, according to the error criterion $J$, the number of elementary
spectra should be two or three.
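The error criterion $J$ is straightforward to evaluate once $V$ and $S$ are available. The sketch below computes it for small made-up matrices; the function name and the numerical values are assumptions for illustration, and the handling of missing wavelengths is left out.

```python
import numpy as np

def normalised_mse(I, V, S):
    """J = < e^2 >_{t, lambda}, with e = (I - V @ S) / I evaluated entry by entry."""
    e = (I - V @ S) / I
    return np.mean(e**2)

# Made-up data: a rank-one "irradiance" matrix and two candidate reconstructions.
I_demo = np.array([[1.0, 2.0], [2.0, 4.0]])
V_demo = np.array([[1.0], [2.0]])
J_exact = normalised_mse(I_demo, V_demo, np.array([[1.0, 2.0]]))   # perfect reconstruction
J_biased = normalised_mse(I_demo, V_demo, np.array([[1.1, 2.2]]))  # 10% too bright everywhere
```

Because the residual is divided by the measured irradiance, weak and strong wavelengths carry the same weight in $J$, which is why a uniform 10% excess yields $J = 0.01$ regardless of the line intensities.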

The number of spectra can also be determined by inspection. With two sources, one
elementary spectrum reproduces the quiet Sun, and the other captures a blend of coronal
and transition region lines. With three sources, as we shall see below, the sources clearly
separate lines that are generated at different temperatures. With four sources, one of
the spectra becomes degenerate, as it appears twice with almost the same content.^1 We
conclude that the spectral variability between 25 and 195 nm is best described by the
superposition of three elementary spectra. From now on, we stick to these three spectra.

The three elementary spectra $S_i(\lambda)$, estimated using BPSS, are shown in Fig. 1, with an
excerpt in Fig. 2. The first elementary spectrum (S1) looks similar to the time-averaged
EUV spectrum, but it is not identical to it: S1 reproduces most of the strongest
contributions, such as the intense H I Lyman-α line (121.57 nm) and the thermal continuum
above 130 nm. Figure 2, however, shows that hot coronal lines such as Fe XV (28.45 and
41.75 nm) and Fe XVI (33.55 and 36.05 nm) are significantly reduced or even totally lacking,
while transition region lines such as Ne VII (46.55 nm) and, for instance, the H I Lyman
continuum are still present. Since a major fraction of S1 comes from the cooler part of the
solar atmosphere, where most of the EUV radiation originates, we interpret it as an average
inactive Sun, defined as a full Sun without important signs of activity.

^1 The difference between the two spectra captures a small instrumental perturbation caused by
temperature variations in the spacecraft. This effect and solutions for mitigating it with the
BPSS will be discussed in a forthcoming paper.

Figure 1: The three elementary spectra (S1-S3, in arbitrary units) and the solar spectrum,
all averaged over the whole time span. Crosses denote 38 intense spectral lines.

The second elementary spectrum S2 is more enigmatic, since it captures the thermal
continuum, but not very many spectral lines apart from chromospheric ones, such as Si II
(180.85 and 181.65 nm) and the wings of He II (30.35 nm). As we see later, this second
spectrum mostly captures the contribution from the coolest part of the chromosphere.

The third elementary spectrum S3 stands out by the absence of the thermal continuum
and the marked presence of hot coronal lines, such as Fe XVI (33.55 and 36.05 nm) and also
Si XII (49.95 and 52.05 nm). Note also how the wings of blended lines are rejected. The
third spectrum can therefore be interpreted as a contribution from hot coronal emissions.

Clearly, our three elementary spectra do not correspond to specific regions of the Sun,
and so cannot be directly compared to reference spectra as obtained from single
instruments. To put our interpretations on firmer ground, we consider the effective temperature
of 38 intense lines in Fig. 3. For each of them, we plot the contribution to the three
elementary spectra, relative to the measured average spectrum, versus the effective
temperature. The latter takes the finite spectral resolution of TIMED/SEE into account for
blended lines.

Figure 2: Excerpt of Fig. 1 showing the detail of the elementary spectra between 28 and
80 nm, as well as the average solar spectrum.

Figure 3: For each of the 38 intense spectral lines shown in Fig. 1, characteristic emission
temperature versus their relative contribution to each of the three elementary spectra.
Some lines have been omitted to avoid excessive cluttering. The temperature is estimated
using the CHIANTI model and the characteristics of the SEE spectrometer. Lines with
a bimodal temperature response are not shown on this plot.

Figure 3 confirms the temperature ordering of the elementary sources, as discussed
above. The first elementary spectrum emphasises chromospheric and transition region
lines, while S2 selects only the coolest chromospheric lines and S3 hot coronal lines.
Thus, from the statistical properties of the spectral variability alone, we can decompose
the solar spectral irradiance into a unique set of three elementary contributions that
correspond to specific temperature bands. To the best of our knowledge, this is the first
proof of the existence of such a decomposition by rigorous means.

Let us now investigate how the contributions of the elementary spectra evolve in time.
Figure 4 shows the absolute contribution $V_k$, as well as the relative contribution
$V_k / (V_1 + V_2 + V_3)$ for each spectrum. The data set covers the declining phase of the
solar cycle, during which the irradiance drops at all wavelengths. As expected, the relative
contribution of S3 gradually decreases in time, as active regions become scarce. The relative
amplitude of S1 stays remarkably constant. This was to be expected, as S1 is dominated by the
contribution from the dominant Lyman-α line (121.57 nm) and the thermal continuum.
As a consequence of this, source S2 increases, both in relative and in absolute terms. This
increase points toward a stronger contribution of the cold chromosphere at solar minimum.
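The relative contributions follow directly from the columns of $V$. The short sketch below normalises each time step so that the three terms add up to one; the function name and the numbers are made up for illustration.

```python
import numpy as np

def relative_contributions(V):
    """Return V_k / (V_1 + ... + V_ne) for every time step; each row sums to 1."""
    V = np.asarray(V, dtype=float)
    return V / V.sum(axis=1, keepdims=True)

# Two hypothetical time steps with three sources each.
rel = relative_contributions([[1.0, 1.0, 2.0],
                              [3.0, 0.0, 1.0]])
```

A source can therefore grow in relative terms even if its absolute contribution is constant, simply because the others decline, which is why both quantities are shown.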

To validate the interpretation of the elementary spectra, we compare their intensity
$V_k(t)$ to various proxies for solar activity, using the cosine distance

$$\gamma_{xy} = \langle x(t)\, y(t) \rangle / \sqrt{\langle x^2(t) \rangle \langle y^2(t) \rangle}$$

rather than the usual correlation coefficient. Both quantities have the same
interpretation and are bounded by $[-1, 1]$; the cosine distance, however, takes into account the
magnitude of the relative variability during the solar cycle. Although none of the indices
can satisfactorily reproduce the spectral variability on both short and long time scales
(Dudok de Wit et al. 2008), high values of $\gamma_{xy}$ should nevertheless hint at the origin of
the spectra.
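The difference between this quantity and the usual correlation coefficient is that the means are not subtracted. A minimal sketch, with made-up time series and an assumed function name, makes the contrast explicit:

```python
import numpy as np

def cosine_distance(x, y):
    """<x y> / sqrt(<x^2> <y^2>), computed without removing the means."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return np.mean(x * y) / np.sqrt(np.mean(x**2) * np.mean(y**2))

# Two positive series that vary in opposite directions around a common level.
x_demo = np.array([1.0, 2.0, 3.0])
y_demo = np.array([3.0, 2.0, 1.0])
g = cosine_distance(x_demo, y_demo)          # stays positive: both series share a large mean
r = np.corrcoef(x_demo, y_demo)[0, 1]        # Pearson coefficient: perfectly anticorrelated
```

Here the Pearson coefficient only sees the opposite trends, whereas the cosine distance also accounts for the shared mean level relative to which the variability is measured.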

The two indices that are most strongly correlated with the first source are by far the
Mg II core-to-wing (Viereck et al. 2001) and the Ca II K (Lean et al. 1982) indices, with
$\gamma_{xy} = 0.995$ and $\gamma_{xy} = 0.996$, respectively. Both indices are indeed known to
reproduce the UV spectrum well. The second source fits no known index, since it increases
over the declining cycle. The third source, however, is strongly correlated with the Mount
Wilson Sunspot Index (Parker et al. 1998) and to a lesser degree with the radiometric f10.7
index (Tobiska et al. 2000), with $\gamma_{xy} = 0.98$ and $\gamma_{xy} = 0.95$, respectively.
Both indices quantify only the contribution from active regions. These results therefore fully
support our interpretation of the first and the third spectra. They also suggest the
possibility of reconstructing the spectrum and its variability using properly chosen proxies.
This should be straightforward for sources S1 and S3, whereas more work is needed to model
source S2. We are currently working on this problem.

4 Discussion and outlooks 

The central result of this study is the possibility to describe the EUV spectral variability 
in terms of only three spectra. This confirms (and in some sense justifies) a long standing 
and purely intuitive practice that consisted in partitioning the Sun into three components. 
Our approach provides a new interpretation of these components, excluding any a priori 
bias. The prevalent viewpoint assumed a horizontal structuring of the solar emission in 
terms of coronal holes, active regions, active network, etc. Our results instead support 
the existence of emitting volumes, with a more vertical structuring. The difference is 
important, as it raises the questions of the energy exchange between the lower atmosphere 
and the corona and of its evolution with the solar cycle. This interpretation will be 
developed in a forthcoming paper. 

Our results also pave the way for a new instrumental concept in which the solar EUV
spectrum would be reconstructed only from a small set of lines or spectral bands, rather
than using a full-fledged spectrometer. The reason for this is the remarkable redundancy of
the spectral variability, which has already been revealed by Dudok de Wit et al. (2005),
using multivariate statistical analysis. Such a redundancy necessarily implies a strong
connection between the physical processes at different solar atmospheric layers.
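One way such a reconstruction could work, given elementary spectra that are already known, is sketched below. This is only an assumed procedure for illustration, not a method described in this letter: the contributions $V$ are estimated by ordinary least squares from a handful of wavelength bins and the full spectrum is then rebuilt as $V \times S$. The function name, the choice of bins, and the noise-free synthetic data are hypothetical, and positivity is not enforced.

```python
import numpy as np

def reconstruct_from_bands(I_bands, S, band_idx):
    """Estimate V from a few wavelength bins, then rebuild the full spectrum.

    I_bands : (n_t, n_bands) irradiance measured only in the selected bins
    S       : (n_e, n_lambda) elementary spectra, assumed known beforehand
    band_idx: indices of the selected bins along the wavelength axis
    """
    S_sub = S[:, band_idx]                                  # (n_e, n_bands)
    # Ordinary least squares for V in I_bands ~ V @ S_sub (no positivity enforced).
    V_T, *_ = np.linalg.lstsq(S_sub.T, I_bands.T, rcond=None)
    V = V_T.T                                               # (n_t, n_e)
    return V, V @ S

# Noise-free synthetic example with three sources and five selected bins.
rng = np.random.default_rng(0)
S_demo = rng.random((3, 50))
V_demo = rng.random((10, 3))
I_full = V_demo @ S_demo
bands = [0, 10, 20, 30, 40]
V_hat, I_hat = reconstruct_from_bands(I_full[:, bands], S_demo, bands)
```

At least as many bins as sources are needed, and in practice the bins would have to be chosen so that the restricted spectra remain well distinguishable from one another.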

Another interesting result is the behaviour of the second and third elementary spectra
in Fig. 4, which supports a gradual migration of the origin of EUV flux from the low
corona and high transition region to the low transition region and high chromosphere. In
a forthcoming study, we shall determine how the heat flux in the transition region could
be forced by this behaviour along the solar cycle. But first, it should be confirmed during
the decreasing part of the solar cycle. And this leads to the following issue: does the
minimum of S2 coincide with solar maximum, if any, or could it be the crossing between
S2 and S3? We probably will need more than a solar cycle of observations to answer this
question.

Figure 4: Upper plot: absolute contribution of each source to the irradiance. Lower plot:
relative contribution of each source to the irradiance. The three terms add up to 100%.